Skip to content

CP-48676: Reuse pool sessions on slave logins #6258

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 14, 2025

Conversation

snwoods
Copy link
Contributor

@snwoods snwoods commented Jan 28, 2025

This was first introduced in #5986 but was later reverted as the mutex it introduced caused a deadlock with the DB mutex. This PR has now been reworked to use an Atomic variable, removing the need for the mutex.

This PR allows pool sessions to be reused (if the flag is turned on, which it is by default). This greatly speeds up communications between hosts which is important for repetitive calls e.g. one customer was running get_servertime frequently on all hosts in a pool which was causing problems due to the length of each call.

When we get this reusable session, we can optionally validate the session before using it but this increases the duration of the call, so this is off by default. Excepting a toolstack restart, the reusable session should always be valid as it is excluded from deletion in destroy_db_session

@snwoods snwoods force-pushed the private/stevenwo/CA-400339 branch from 81c5fd5 to f17fb1b Compare January 28, 2025 12:53
Copy link
Member

@psafont psafont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor style issues. The general approach seems quite sensible, even if a race can cause xapi to use more than one session (to avoid a loop when getting an invalid resusable session)

@snwoods snwoods force-pushed the private/stevenwo/CA-400339 branch from f17fb1b to b94335a Compare January 31, 2025 15:52
@snwoods
Copy link
Contributor Author

snwoods commented Feb 3, 2025

The Ring3: BVT + BST suiteruns I have run were all green, both with (211574) and without (211571) additional debug logging. The debug suiterun shows that the pool sessions are being reused as expected.

@snwoods snwoods force-pushed the private/stevenwo/CA-400339 branch from b94335a to 3efb3bb Compare February 4, 2025 11:48
@snwoods snwoods force-pushed the private/stevenwo/CA-400339 branch from 3efb3bb to 151686b Compare February 20, 2025 09:44
@snwoods
Copy link
Contributor Author

snwoods commented Feb 26, 2025

During testing (BVT + BST etc.) the resuable_pool_session never became invalid because we prevent the resuable_pool_session from being destroyed in destroy_db_session. Something I'm unsure about though is, is there any possible situation where the session does become invalid (excluding a toolstack restart where a new session will be made anyway)? If that were to happen and Xapi_globs.validate_reusable_pool_session = false, is_valid_session would still return true (as session != null) and whatever the session is used for would fail. I'm unsure if we would then get out of this state.

I've marked all the comments as resolved where appropriate. This is still an outstanding question for me though.

@snwoods snwoods force-pushed the private/stevenwo/CA-400339 branch from 151686b to c6da552 Compare February 26, 2025 10:58
@edwintorok
Copy link
Contributor

edwintorok commented Mar 6, 2025

This is still being tested, apparently the member is not reusing the session enough times (we were trying to determine whether it is able to cope with a coordinator restart, and it seems it can, but then it is not trying to reuse the session either).

Putting back into draft until we figure out whether once it is reusing the session properly if it still works.

@edwintorok edwintorok marked this pull request as draft March 6, 2025 10:13
@edwintorok
Copy link
Contributor

OK looked at this in more detail, and the pool session was mostly actually created on the coordinator, so this is fine (it avoids lots of DB calls to create sessions, and reuses them).
The SM sessions doesn't yet take advantage of this, it'll need a separate PR to reuse SM sessions and avoid logouts on SM.

@edwintorok edwintorok marked this pull request as ready for review March 11, 2025 10:48
@snwoods
Copy link
Contributor Author

snwoods commented Mar 11, 2025

Yes as Edwin said, we were initially confused about which way the improvement should be going, hence the comment about the coordinator restarting. In actual fact, this PR is to avoid remaking sessions on the coordinator, so the coordinator or member restarting is not an issue (in the former case, the coordinator will get a new reusable session and in the latter, the reusable session will persist). As you can see in the image below, the difference from reusing the session is 1.1ms without this change and 0.05ms with the change.
{E1E89C6D-5B95-462F-9CAA-FE96AFEFA60F}

snwoods and others added 3 commits March 13, 2025 11:29
Prevent this reusable pool session from being destroyed so that it
remains valid. This feature can be toggled with the flag reuse-pool-sessions.

This commit has been reworked to use Atomic instead of a mutex to
prevent a deadlock as shown in CA-400339.

Signed-off-by: Steven Woods <steven.woods@citrix.com>
Add a new flag validate-reusable-pool-session to xapi globs which skips
the reusable pool session validity check if it is false. This saves time
as we are no longer calling pool.get_all for each session.

Signed-off-by: Steven Woods <steven.woods@citrix.com>
This can be reverted once we have sufficient confidence from running
nightlies

Signed-off-by: Steven Woods <steven.woods@cloud.com>
@snwoods snwoods force-pushed the private/stevenwo/CA-400339 branch from 1d07b7c to e42cca1 Compare March 13, 2025 11:33
@snwoods
Copy link
Contributor Author

snwoods commented Mar 13, 2025

I've added a commit which defaults reuse_pool_sessions to false, so this PR is a noop and safe to merge. We can then run some nightlies with the commit reverted and then revert the commit on master when we are confident.

@edwintorok edwintorok added this pull request to the merge queue Mar 14, 2025
Merged via the queue into xapi-project:master with commit 17514dc Mar 14, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants